Context Model Automata for Text Compression

Authors

  • Pasi Fränti
  • Timo Hatakka
Abstract

Finite-state automata offer an alternative approach to implementing context models, one in which the states of the automaton cannot in general be identified with a single context. Despite its potential, this approach makes the design of the model more problematic because the exact behaviour of the model is not known. Here we propose a simple formalism, context model automata (CMA), that gives an exact interpretation of the minimum context belonging to each state in the automaton. The formalism is general enough to simulate context models such as PPM and GDMC. Using the CMA formalism as our tool, we study the behaviour of these two context models.
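
As a rough illustration of the simplest case the abstract alludes to, a fixed-order context model can be viewed as an automaton in which every state corresponds to exactly one context (the last k symbols seen). The sketch below is not the CMA formalism, PPM, or GDMC; the class name `ContextAutomaton`, its methods, and the Laplace smoothing are assumptions made only for this example.

```python
from collections import defaultdict


class ContextAutomaton:
    """Order-k context model viewed as a finite-state automaton.

    Hypothetical illustration: each state is labelled by one context,
    namely the most recent `order` symbols of the input.
    """

    def __init__(self, order):
        self.order = order
        self.state = ""  # start state: empty context
        # counts[state][symbol] = times `symbol` followed `state`
        self.counts = defaultdict(lambda: defaultdict(int))

    def predict(self, symbol, alphabet_size):
        """Laplace-smoothed estimate of P(symbol | current state)."""
        seen = self.counts[self.state]
        total = sum(seen.values())
        return (seen[symbol] + 1) / (total + alphabet_size)

    def update(self, symbol):
        """Record `symbol` in the current state, then follow the transition."""
        self.counts[self.state][symbol] += 1
        if self.order > 0:
            # Next state: the last `order` symbols including the new one.
            self.state = (self.state + symbol)[-self.order:]


model = ContextAutomaton(order=1)
for ch in "abab":
    model.update(ch)
print(model.state, model.predict("a", alphabet_size=2))
```

In automata built by state cloning, a state generally stands for a set of contexts rather than a single string, which is the situation the abstract says makes the model's behaviour hard to interpret.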

Similar resources

Cellular Automata Based Document Compression Technology for On-line Network Transmission

This paper reports an efficient document compression technology suitable for on-line network transmission. It identifies the different segments, such as text, image, and background, within a scanned document. Three distinct compression techniques are developed around the regular structure of Cellular Automata (CA) for the compression of text, image and background segments to achieve better comp...

Average Linear Time and Compressed Space Construction of the Burrows-Wheeler Transform

The Burrows-Wheeler Transform is a text permutation that has revolutionized the fields of pattern matching and text compression, bridging the gap existing between the two. In this paper we approach the BWT-construction problem generalizing a well-known algorithm—based on backward search and dynamic strings manipulation—to work in a context-wise fashion, using automata on words. Let n, σ, and Hk...

Lexical Attraction for Text Compression

New methods of acquiring structural information in text documents may support better compression by identifying an appropriate prediction context for each symbol. The method of “lexical attraction” infers syntactic dependency structures from statistical analysis of large corpora. We describe the generation of a lexical attraction model, discuss its application to text compression, and explore i...

Compression Using Antidictionaries

We give a new text compression scheme based on Forbidden Words ("antidictionary"). We prove that our algorithms attain the entropy for equilibrated binary sources. One of the main advantages of this approach is that it produces very fast decompressors. A second advantage is a synchronization property that is helpful to search compressed data and to parallelize the compressor. Our algorithms can ...

Comparative study of Arithmetic and Huffman Compression Techniques for Enhancing Security and Effective Bandwidth Utilization in the Context of ECC for Text

In this paper, we proposed a model for text encryption using elliptic curve cryptography (ECC) for secure transmission of text and by incorporating the Arithmetic/Huffman data compression technique for effective utilization of channel bandwidth and enhancing the security. In this model, every character of text message is transformed into the elliptic curve points

Journal:
  • Comput. J.

Volume 41

Publication date: 1998